Conversation

@songkant-aws
Contributor

@songkant-aws songkant-aws commented Jan 26, 2026

Description

Fix the bug discovered in #5054. See root cause description in #5054 (comment)

Related Issues

Resolves #5054

Check List

  • New functionality includes testing.
  • New functionality has been documented.
  • New functionality has javadoc added.
  • New functionality has a user manual doc added.
  • New PPL command checklist all confirmed.
  • API changes companion pull request created.
  • Commits are signed per the DCO using --signoff or -s.
  • Public documentation issue/PR created.

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Signed-off-by: Songkan Tang <songkant@amazon.com>
@coderabbitai
Contributor

coderabbitai bot commented Jan 26, 2026

📝 Walkthrough

Summary by CodeRabbit

  • Bug Fixes

    • Boolean-field comparisons (TRUE/FALSE, NOT, !=) now produce correct term and mustNot filters, preserving null/missing semantics and improving pushed-down query accuracy.
    • Aggregation filter behavior adjusted to prefer boolean term filters where applicable.
  • Tests

    • Added unit and integration tests and expected-plan fixtures covering query_string, true/false, NOT and != pushdown scenarios.
    • Updated YAML-based integration expectations to use concise length-based validations for sample counts.

Walkthrough

Detect and convert boolean field expressions earlier during predicate analysis and Calcite traversal to emit true/false term (or negated) queries; add unit tests, Calcite integration tests, a YAML REST test, and expected explain-plan YAMLs covering boolean pushdown cases.
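The difference between the four boolean predicates named in this walkthrough can be illustrated with a toy model. This is a minimal sketch, not code from the PR: the class `BooleanPredicateSketch`, its `Op` enum, and its methods are hypothetical, and a Java `null` stands in for a null or missing field value. Only the operator names (IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE) come from the change summary.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

/** Toy model of predicates over a nullable boolean field (hypothetical, not plugin code). */
final class BooleanPredicateSketch {

  /** Named after the operators mentioned in the walkthrough. */
  enum Op { IS_TRUE, IS_FALSE, IS_NOT_TRUE, IS_NOT_FALSE }

  /** Evaluates a predicate; a null value stands for a null or missing field. */
  static boolean matches(Op op, Boolean value) {
    switch (op) {
      case IS_TRUE:
        return Boolean.TRUE.equals(value);
      case IS_FALSE:
        return Boolean.FALSE.equals(value);
      case IS_NOT_TRUE:
        // Matches false AND null/missing, unlike IS_FALSE.
        return !Boolean.TRUE.equals(value);
      case IS_NOT_FALSE:
        // Matches true AND null/missing, unlike IS_TRUE.
        return !Boolean.FALSE.equals(value);
      default:
        throw new IllegalArgumentException("Unknown op: " + op);
    }
  }

  /** Keeps only the values that satisfy the predicate. */
  static List<Boolean> filter(Op op, List<Boolean> values) {
    return values.stream().filter(v -> matches(op, v)).collect(Collectors.toList());
  }

  public static void main(String[] args) {
    List<Boolean> docs = Arrays.asList(true, false, null);
    for (Op op : Op.values()) {
      System.out.println(op + " -> " + filter(op, docs));
    }
  }
}
```

In this model, IS_NOT_TRUE keeps `[false, null]` while IS_FALSE keeps only `[false]`, which is why a negated comparison cannot simply be replaced by a comparison against the opposite literal.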

Changes

| Cohort / File(s) | Summary |
|---|---|
| Predicate analysis & boolean helpers<br>opensearch/src/main/java/org/opensearch/sql/opensearch/request/PredicateAnalyzer.java | Add NamedFieldExpression.isBooleanType(); add QueryExpression.isFalse(), isNotFalse(), isNotTrue(); short-circuit boolean NamedFieldExpression to term queries and extend NOT/IS_TRUE/IS_FALSE handling. |
| Calcite boolean rewrites<br>core/src/main/java/org/opensearch/sql/calcite/CalciteRexNodeVisitor.java | Add boolean-aware rewrites for NOT and !=/<> on boolean fields to produce IS_NOT_TRUE/IS_NOT_FALSE when applicable; private helpers to extract boolean comparisons. |
| Opensearch unit tests<br>opensearch/src/test/java/org/opensearch/sql/opensearch/request/PredicateAnalyzerTest.java, opensearch/src/test/java/org/opensearch/sql/opensearch/request/AggregateAnalyzerTest.java | Extend test schema with a boolean field and add assertions for IS_TRUE term generation and compound queries; update aggregate test expectation from script filter to term d = true. |
| Calcite integration tests<br>integ-test/src/test/java/org/opensearch/sql/calcite/remote/CalciteExplainIT.java | Add six explain tests covering boolean pushdown with query_string, TRUE/'TRUE', false, NOT true, and != true, comparing explain outputs to expected YAMLs. |
| Calcite expected explain plans<br>integ-test/src/test/resources/expectedOutput/calcite/...<br>explain_filter_query_string_with_boolean.yaml, explain_filter_query_string_with_boolean_false.yaml, explain_filter_query_string_with_boolean_not_true.yaml | Add expected logical/physical plans showing pushed-down boolean term filters (and must_not variants) combined with query_string in PushDownContext/OpenSearchRequestBuilder. |
| YAML REST integration test<br>integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml | Add REST test fixture creating an index with a boolean field, bulk docs, enabling/disabling Calcite plugin, and assertions for is_internal=true/false and NOT variants. |

Sequence Diagram(s)

sequenceDiagram
    participant Client as Client
    participant Planner as CalcitePlanner
    participant Rex as CalciteRexNodeVisitor
    participant Analyzer as PredicateAnalyzer
    participant QExpr as QueryExpression
    participant DSL as DSLGenerator

    Client->>Planner: submit SQL with boolean predicate
    Planner->>Rex: translate Rex nodes (compare / NOT)
    Rex->>Planner: rewrite != / NOT -> IS_NOT_* when applicable
    Planner->>Analyzer: analyzeExpression(filter)
    Analyzer->>Analyzer: detect NamedFieldExpression.isBooleanType()
    Analyzer->>QExpr: convert boolean field -> isTrue()/isFalse()/isNotTrue()/isNotFalse()
    QExpr->>DSL: emit TermQuery or must_not TermQuery (combined with query_string)
    DSL-->>Planner: return pushed-down DSL
    Planner-->>Client: explain/execute with pushed-down boolean term

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~45 minutes

Possibly related PRs

Suggested labels

calcite

Suggested reviewers

  • ps48
  • penghuo
  • anirudha
  • derek-ho
  • joshuali925
  • kavithacm
  • Swiddis
  • GumpacG
🚥 Pre-merge checks | ✅ 2 | ❌ 1
❌ Failed checks (1 warning)
| Check name | Status | Explanation | Resolution |
|---|---|---|---|
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 15.00% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |
✅ Passed checks (2 passed)
| Check name | Status | Explanation |
|---|---|---|
| Title check | ✅ Passed | The title directly addresses the main change: fixing a bug related to boolean field comparison pushdown behavior during simplification. |
| Description check | ✅ Passed | The description clearly references issue #5054 and links to a detailed root cause analysis, directly relating to the changeset's bug fix purpose. |


Signed-off-by: Songkan Tang <songkant@amazon.com>
@penghuo added the bugFix and PPL (Piped processing language) labels on Jan 26, 2026
        Content-Type: 'application/json'
      ppl:
        body:
          query: source=test-boolean | where is_internal=true | fields name
Collaborator

Use the failing query from #5054 here: source=test url=http | where is_internal=true

Comment on lines 582 to 586
// Handle NOT(IS_TRUE(boolean_field)) - convert to term query with false value
// This covers cases where IS_TRUE was explicitly applied
if (expr instanceof SimpleQueryExpression simpleExpr && simpleExpr.isBooleanFieldIsTrue()) {
  return QueryExpression.create(simpleExpr.rel).isFalse();
}
Collaborator

  • (NOT boolean_field = true) returns documents where the field is false, null, or missing,
  • but boolean_field=false returns only documents where the field is false.
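The two query shapes being contrasted here can be sketched as plain JSON text. This is an illustrative sketch only: `BooleanDslSketch` and its methods are hypothetical helpers, not the plugin's query builders, and the field name `is_internal` is borrowed from the REST test in this PR. A `term` query matches only documents that hold the given value, whereas a `bool` query with a `must_not` clause matches every document that does not hold it, including documents where the field is absent.

```java
/** Hypothetical helper that renders the two query shapes as JSON text (not plugin code). */
final class BooleanDslSketch {

  /** A term query: matches only documents whose field holds exactly this value. */
  static String term(String field, boolean value) {
    return "{\"term\":{\"" + field + "\":{\"value\":" + value + "}}}";
  }

  /** A bool/must_not wrapper: matches documents that do NOT hold this value, including missing ones. */
  static String mustNotTerm(String field, boolean value) {
    return "{\"bool\":{\"must_not\":[" + term(field, value) + "]}}";
  }

  public static void main(String[] args) {
    // boolean_field = false: only documents with an explicit false.
    System.out.println(term("is_internal", false));
    // NOT boolean_field = true: false, null, and missing documents.
    System.out.println(mustNotTerm("is_internal", true));
  }
}
```

Mapping NOT(IS_TRUE(field)) to the first shape instead of the second would silently drop documents whose field is null or missing.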

// generate a term query with value true.
// When called on an already-evaluated predicate (builder already set),
// return as-is.
if (builder == null) {
Collaborator

Is it possible to override the isTrue and not APIs on NamedFieldExpression instead of changing SimpleQueryExpression?

Signed-off-by: Songkan Tang <songkant@amazon.com>
Contributor

@coderabbitai coderabbitai bot left a comment

Actionable comments posted: 1

🤖 Fix all issues with AI agents
In `@integ-test/src/yamlRestTest/resources/rest-api-spec/test/issues/5054.yml`:
- Around line 1-15: The test uses an index named "test" and currently doesn't
clean it up; update the YAML to ensure index isolation by adding explicit delete
steps for the "test" index in both the setup and teardown blocks (or replace
"test" with a generated unique name), e.g., add a do: delete index action before
the test runs and another delete after the test completes so the index cannot
leak state or conflict with other tests; reference the existing setup/teardown
blocks and the index name "test" when making these changes.

Comment on lines +1 to +15
setup:
  - do:
      query.settings:
        body:
          transient:
            plugins.calcite.enabled: true

---
teardown:
  - do:
      query.settings:
        body:
          transient:
            plugins.calcite.enabled: false

Contributor

⚠️ Potential issue | 🟠 Major

Ensure index isolation by cleaning up test before/after use.
Right now the test can fail or leak state if an index named test already exists or is reused. Add a cleanup step (or use a unique index name) to keep this test independent.

🧹 Suggested cleanup (align with existing YAML REST test patterns)
 setup:
   - do:
       query.settings:
         body:
           transient:
             plugins.calcite.enabled: true
+  - do:
+      indices.delete:
+        index: test
+        ignore: 404

 ---
 teardown:
   - do:
       query.settings:
         body:
           transient:
             plugins.calcite.enabled: false
+  - do:
+      indices.delete:
+        index: test
+        ignore: 404

As per coding guidelines: Tests must not rely on execution order; ensure test independence.

Also applies to: 23-34

Signed-off-by: Songkan Tang <songkant@amazon.com>
Development

Successfully merging this pull request may close these issues.

[BUG] PPL where command does not work as expected.

3 participants